Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 44444 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 4.4 MiB |
| Average record size in memory | 104.0 B |
Variable types
| Numeric | 8 |
|---|---|
| Categorical | 5 |
imp_hash has constant value "fbcff5951ad0c204f4744c629548c6c6" | Constant |
filename has a high cardinality: 383 distinct values | High cardinality |
sha256 has a high cardinality: 872 distinct values | High cardinality |
sec_md5 has a high cardinality: 176 distinct values | High cardinality |
sec_name has a high cardinality: 1495 distinct values | High cardinality |
df_index is highly correlated with Unnamed: 0 and 1 other fields | High correlation |
Unnamed: 0 is highly correlated with df_index and 1 other fields | High correlation |
win_count is highly correlated with df_index and 1 other fields | High correlation |
sec_chi2 is highly correlated with sec_entropy | High correlation |
sec_entropy is highly correlated with sec_chi2 and 1 other fields | High correlation |
raw_size is highly correlated with virtual_size | High correlation |
virtual_size is highly correlated with raw_size | High correlation |
virtual_address is highly correlated with sec_entropy | High correlation |
sec_entropy is highly correlated with raw_size and 2 other fields | High correlation |
raw_size is highly correlated with sec_entropy and 1 other fields | High correlation |
virtual_size is highly correlated with sec_entropy and 1 other fields | High correlation |
sec_chi2 is highly correlated with raw_size and 1 other fields | High correlation |
sec_entropy is highly correlated with raw_size and 2 other fields | High correlation |
raw_size is highly correlated with sec_chi2 and 3 other fields | High correlation |
virtual_size is highly correlated with sec_chi2 and 3 other fields | High correlation |
virtual_address is highly correlated with sec_entropy and 2 other fields | High correlation |
df_index has unique values | Unique |
Unnamed: 0 has unique values | Unique |
sec_entropy has 37863 (85.2%) zeros | Zeros |
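The Constant and Unique alerts above come down to counting distinct values per column. A minimal sketch, assuming the table is held as a plain dict of column lists; the `flag_columns` helper and the toy rows are hypothetical and are not part of pandas-profiling:

```python
def flag_columns(columns):
    """Return {column_name: [alert, ...]} for constant and unique columns.

    `columns` maps a column name to a list of its values (hypothetical layout).
    """
    alerts = {}
    for name, values in columns.items():
        distinct = len(set(values))
        found = []
        if distinct == 1:
            found.append("Constant")   # one value repeated in every row
        if distinct == len(values) and len(values) > 1:
            found.append("Unique")     # no value repeats
        if found:
            alerts[name] = found
    return alerts


# Toy data shaped like the profiled table; the values are illustrative only.
toy = {
    "df_index": [5245, 5246, 5247, 5248],
    "imp_hash": ["fbcff5951ad0c204f4744c629548c6c6"] * 4,
    "sec_name": [".text", ".rdata", ".data", ".text"],
}
print(flag_columns(toy))
```

Columns flagged this way (a constant hash, a row counter) usually carry no signal for modelling and are candidates for removal.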
Reproduction
| Analysis started | 2022-09-05 02:05:35.414278 |
|---|---|
| Analysis finished | 2022-09-05 02:05:43.411799 |
| Duration | 8 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
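A report like this one can be regenerated roughly as follows. This is a sketch only: the CSV path, output path and report title are hypothetical, and it assumes pandas and pandas-profiling v3.2.0 are installed.

```python
def build_report(csv_path, out_path="report.html"):
    """Profile a CSV file and write an HTML report (both paths are hypothetical)."""
    # Imported lazily so the sketch can be loaded without the libraries installed.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Dataset profile")
    profile.to_file(out_path)
    return out_path
```

An `Unnamed: 0` column typically appears when a CSV that was written with its index is read back without `index_col=0`; passing that argument to `pd.read_csv` would avoid profiling the same counter twice.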
df_index
Real number (ℝ≥0)
HIGH CORRELATION, UNIQUE
| Distinct | 44444 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2755043.742 |
| Minimum | 5245 |
|---|---|
| Maximum | 5555048 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 347.3 KiB |
Quantile statistics
| Minimum | 5245 |
|---|---|
| 5-th percentile | 702247.15 |
| Q1 | 1021131.75 |
| median | 2007641.5 |
| Q3 | 4764338.25 |
| 95-th percentile | 5057719.85 |
| Maximum | 5555048 |
| Range | 5549803 |
| Interquartile range (IQR) | 3743206.5 |
Descriptive statistics
| Standard deviation | 1751777.652 |
|---|---|
| Coefficient of variation (CV) | 0.6358438616 |
| Kurtosis | -1.718630857 |
| Mean | 2755043.742 |
| Median Absolute Deviation (MAD) | 1121038 |
| Skewness | 0.1705083425 |
| Sum | 1.224451641 × 10¹¹ |
| Variance | 3.068724942 × 10¹² |
| Monotonicity | Strictly increasing |
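Several of the figures above are derived from others in the same tables. The sketch below recomputes them from the reported df_index values as a consistency check; the variable names are illustrative.

```python
# Values copied from the df_index tables above.
mean = 2755043.742
std = 1751777.652
q1, q3 = 1021131.75, 4764338.25
minimum, maximum = 5245, 5555048
n = 44444                      # number of observations

cv = std / mean                # coefficient of variation
iqr = q3 - q1                  # interquartile range
value_range = maximum - minimum
variance = std ** 2
total = mean * n               # the reported Sum, up to rounding of the mean

print(round(cv, 4), iqr, value_range)
```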
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 5245 | 1 | < 0.1% |
| 4546072 | 1 | < 0.1% |
| 4541475 | 1 | < 0.1% |
| 4541476 | 1 | < 0.1% |
| 4541477 | 1 | < 0.1% |
| 4541478 | 1 | < 0.1% |
| 4541479 | 1 | < 0.1% |
| 4546069 | 1 | < 0.1% |
| 4546070 | 1 | < 0.1% |
| 4546071 | 1 | < 0.1% |
| Other values (44434) | 44434 |
| Value | Count | Frequency (%) |
| 5245 | 1 | |
| 5246 | 1 | |
| 5247 | 1 | |
| 5248 | 1 | |
| 5249 | 1 | |
| 5250 | 1 | |
| 5251 | 1 | |
| 5252 | 1 | |
| 5253 | 1 | |
| 5254 | 1 |
| Value | Count | Frequency (%) |
| 5555048 | 1 | |
| 5555047 | 1 | |
| 5555046 | 1 | |
| 5555045 | 1 | |
| 5555044 | 1 | |
| 5555043 | 1 | |
| 5555042 | 1 | |
| 5555041 | 1 | |
| 5555040 | 1 | |
| 5555039 | 1 |
Unnamed: 0
Real number (ℝ≥0)
HIGH CORRELATION, UNIQUE
| Distinct | 44444 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2755043.742 |
| Minimum | 5245 |
|---|---|
| Maximum | 5555048 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 347.3 KiB |
Quantile statistics
| Minimum | 5245 |
|---|---|
| 5-th percentile | 702247.15 |
| Q1 | 1021131.75 |
| median | 2007641.5 |
| Q3 | 4764338.25 |
| 95-th percentile | 5057719.85 |
| Maximum | 5555048 |
| Range | 5549803 |
| Interquartile range (IQR) | 3743206.5 |
Descriptive statistics
| Standard deviation | 1751777.652 |
|---|---|
| Coefficient of variation (CV) | 0.6358438616 |
| Kurtosis | -1.718630857 |
| Mean | 2755043.742 |
| Median Absolute Deviation (MAD) | 1121038 |
| Skewness | 0.1705083425 |
| Sum | 1.224451641 × 10¹¹ |
| Variance | 3.068724942 × 10¹² |
| Monotonicity | Strictly increasing |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 5245 | 1 | < 0.1% |
| 4546072 | 1 | < 0.1% |
| 4541475 | 1 | < 0.1% |
| 4541476 | 1 | < 0.1% |
| 4541477 | 1 | < 0.1% |
| 4541478 | 1 | < 0.1% |
| 4541479 | 1 | < 0.1% |
| 4546069 | 1 | < 0.1% |
| 4546070 | 1 | < 0.1% |
| 4546071 | 1 | < 0.1% |
| Other values (44434) | 44434 |
| Value | Count | Frequency (%) |
| 5245 | 1 | |
| 5246 | 1 | |
| 5247 | 1 | |
| 5248 | 1 | |
| 5249 | 1 | |
| 5250 | 1 | |
| 5251 | 1 | |
| 5252 | 1 | |
| 5253 | 1 | |
| 5254 | 1 |
| Value | Count | Frequency (%) |
| 5555048 | 1 | |
| 5555047 | 1 | |
| 5555046 | 1 | |
| 5555045 | 1 | |
| 5555044 | 1 | |
| 5555043 | 1 | |
| 5555042 | 1 | |
| 5555041 | 1 | |
| 5555040 | 1 | |
| 5555039 | 1 |
filename
Categorical
| Distinct | 383 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 347.3 KiB |
| 2022041900/2022041900_47 | 850 |
|---|---|
| 2022041900/2022041900_11 | 550 |
| 2022041900/2022041900_40 | 521 |
| 2022041900/2022041900_12 | 500 |
| 2022041900/2022041900_32 | 500 |
| Other values (378) |
Length
| Max length | 33 |
|---|---|
| Median length | 24 |
| Mean length | 24.2697552 |
| Min length | 23 |
Characters and Unicode
| Total characters | 1078645 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 3 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
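The category names used in the character tables (Decimal Number, Other Punctuation, Connector Punctuation) correspond to Unicode general categories. A minimal sketch using Python's standard `unicodedata` module on one value from the Sample table; the standard library exposes categories but not scripts or blocks, so only categories are shown:

```python
import unicodedata
from collections import Counter

# One value taken from the Sample table of this variable.
value = "20220329/2022032900/2022032900_10"

# unicodedata.category returns the two-letter general category code:
# "Nd" = Decimal Number, "Po" = Other Punctuation, "Pc" = Connector Punctuation.
categories = Counter(unicodedata.category(ch) for ch in value)
print(categories)
```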
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 20220329/2022032900/2022032900_10 |
|---|---|
| 2nd row | 20220329/2022032900/2022032900_10 |
| 3rd row | 20220329/2022032900/2022032900_10 |
| 4th row | 20220329/2022032900/2022032900_10 |
| 5th row | 20220329/2022032900/2022032900_10 |
Common Values
| Value | Count | Frequency (%) |
| 2022041900/2022041900_47 | 850 | 1.9% |
| 2022041900/2022041900_11 | 550 | 1.2% |
| 2022041900/2022041900_40 | 521 | 1.2% |
| 2022041900/2022041900_12 | 500 | 1.1% |
| 2022041900/2022041900_32 | 500 | 1.1% |
| 2022041900/2022041900_56 | 500 | 1.1% |
| 2022041900/2022041900_55 | 450 | 1.0% |
| 2022041901/2022041901_1 | 450 | 1.0% |
| 2022041921/2022041921_52 | 445 | 1.0% |
| 2022041922/2022041922_1 | 437 | 1.0% |
| Other values (373) | 39241 |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| 2022041900/2022041900_47 | 850 | 1.9% |
| 2022041900/2022041900_11 | 550 | 1.2% |
| 2022041900/2022041900_40 | 521 | 1.2% |
| 2022041900/2022041900_12 | 500 | 1.1% |
| 2022041900/2022041900_32 | 500 | 1.1% |
| 2022041900/2022041900_56 | 500 | 1.1% |
| 2022041900/2022041900_55 | 450 | 1.0% |
| 2022041901/2022041901_1 | 450 | 1.0% |
| 2022041921/2022041921_52 | 445 | 1.0% |
| 2022041922/2022041922_1 | 437 | 1.0% |
| Other values (373) | 39241 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 338022 | 31.3% |
| 0 | 267358 | 24.8% |
| 1 | 128205 | 11.9% |
| 4 | 98327 | 9.1% |
| 9 | 95988 | 8.9% |
| / | 46795 | 4.3% |
| _ | 44444 | 4.1% |
| 3 | 24424 | 2.3% |
| 5 | 15230 | 1.4% |
| 7 | 8475 | 0.8% |
| Other values (2) | 11377 | 1.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 987406 | 91.5% |
| Other Punctuation | 46795 | 4.3% |
| Connector Punctuation | 44444 | 4.1% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 338022 | 34.2% |
| 0 | 267358 | 27.1% |
| 1 | 128205 | 13.0% |
| 4 | 98327 | 10.0% |
| 9 | 95988 | 9.7% |
| 3 | 24424 | 2.5% |
| 5 | 15230 | 1.5% |
| 7 | 8475 | 0.9% |
| 6 | 6107 | 0.6% |
| 8 | 5270 | 0.5% |
Other Punctuation
| Value | Count | Frequency (%) |
| / | 46795 |
Connector Punctuation
| Value | Count | Frequency (%) |
| _ | 44444 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1078645 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 338022 | |
| 0 | 267358 | |
| 1 | 128205 | 11.9% |
| 4 | 98327 | 9.1% |
| 9 | 95988 | 8.9% |
| / | 46795 | 4.3% |
| _ | 44444 | 4.1% |
| 3 | 24424 | 2.3% |
| 5 | 15230 | 1.4% |
| 7 | 8475 | 0.8% |
| Other values (2) | 11377 | 1.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1078645 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 338022 | |
| 0 | 267358 | |
| 1 | 128205 | 11.9% |
| 4 | 98327 | 9.1% |
| 9 | 95988 | 8.9% |
| / | 46795 | 4.3% |
| _ | 44444 | 4.1% |
| 3 | 24424 | 2.3% |
| 5 | 15230 | 1.4% |
| 7 | 8475 | 0.8% |
| Other values (2) | 11377 | 1.1% |
win_count
Real number (ℝ≥0)
| Distinct | 917 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 303154.1374 |
| Minimum | 991 |
|---|---|
| Maximum | 591567 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 347.3 KiB |
Quantile statistics
| Minimum | 991 |
|---|---|
| 5-th percentile | 97004 |
| Q1 | 179444 |
| median | 279558 |
| Q3 | 453279 |
| 95-th percentile | 499902 |
| Maximum | 591567 |
| Range | 590576 |
| Interquartile range (IQR) | 273835 |
Descriptive statistics
| Standard deviation | 141769.3064 |
|---|---|
| Coefficient of variation (CV) | 0.4676476054 |
| Kurtosis | -1.270612918 |
| Mean | 303154.1374 |
| Median Absolute Deviation (MAD) | 110089 |
| Skewness | -0.01351828215 |
| Sum | 1.347338248 × 10¹⁰ |
| Variance | 2.009853624 × 10¹⁰ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 379482 | 50 | 0.1% |
| 297959 | 50 | 0.1% |
| 255606 | 50 | 0.1% |
| 268503 | 50 | 0.1% |
| 268519 | 50 | 0.1% |
| 269540 | 50 | 0.1% |
| 269552 | 50 | 0.1% |
| 269969 | 50 | 0.1% |
| 278964 | 50 | 0.1% |
| 279072 | 50 | 0.1% |
| Other values (907) | 43944 |
| Value | Count | Frequency (%) |
| 991 | 20 | < 0.1% |
| 1608 | 48 | |
| 2018 | 50 | |
| 2218 | 50 | |
| 3896 | 50 | |
| 4222 | 50 | |
| 4232 | 50 | |
| 5000 | 50 | |
| 5268 | 50 | |
| 5481 | 50 |
| Value | Count | Frequency (%) |
| 591567 | 50 | |
| 587420 | 48 | |
| 575750 | 50 | |
| 543451 | 50 | |
| 537175 | 50 | |
| 528031 | 50 | |
| 525450 | 45 | |
| 524104 | 33 | |
| 519105 | 50 | |
| 518853 | 50 |
sha256
Categorical
| Distinct | 872 |
|---|---|
| Distinct (%) | 2.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 347.3 KiB |
| 0ad188c147c18398d5ee2e54be5ce6a3c196ed247a352851e22e51d3a9d92c7a | 150 |
|---|---|
| b2fe2aa2a8809247417e0f33c35e0c808682ad46b5f6488987787c6c0052fd87 | 100 |
| b12699ae963bb426fbdcf4b69f08caec6c32213bf6a1a7b9201c08f908c32471 | 100 |
| 3db5f4cc1671c074697105754869e2ee80a830de3b41a6a5364415850d98ab87 | 100 |
| 79ef584a41008e2a42b6c48fa4a95eed727a0aa6a932d9dbf0f0d29f5c509c0b | 100 |
| Other values (867) |
Length
| Max length | 64 |
|---|---|
| Median length | 64 |
| Mean length | 64 |
| Min length | 64 |
Characters and Unicode
| Total characters | 2844416 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 |
|---|---|
| 2nd row | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 |
| 3rd row | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 |
| 4th row | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 |
| 5th row | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 |
Common Values
| Value | Count | Frequency (%) |
| 0ad188c147c18398d5ee2e54be5ce6a3c196ed247a352851e22e51d3a9d92c7a | 150 | 0.3% |
| b2fe2aa2a8809247417e0f33c35e0c808682ad46b5f6488987787c6c0052fd87 | 100 | 0.2% |
| b12699ae963bb426fbdcf4b69f08caec6c32213bf6a1a7b9201c08f908c32471 | 100 | 0.2% |
| 3db5f4cc1671c074697105754869e2ee80a830de3b41a6a5364415850d98ab87 | 100 | 0.2% |
| 79ef584a41008e2a42b6c48fa4a95eed727a0aa6a932d9dbf0f0d29f5c509c0b | 100 | 0.2% |
| 11e6868502b491a93475f42a8f5f3bf1cfd635820df750530618cf1849f1307d | 100 | 0.2% |
| 7f6a104a98d7400c2adf22aa5407ac7a8854342ca1ba34512b8890c5a4201ad1 | 100 | 0.2% |
| 1fccb0fd26e47b82cb0522a39873711ea023e2dbf66e3c6e0435d267454485e7 | 100 | 0.2% |
| d0e155e69192557cb2aa2095d814fa80cbfb1ce82a75ef611257f464ec771768 | 100 | 0.2% |
| 7684f8426a58c1f902e13e25c68b89a3f06a5418370a3f2aae2ba3cc3f62ebae | 100 | 0.2% |
| Other values (862) | 43394 |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| 0ad188c147c18398d5ee2e54be5ce6a3c196ed247a352851e22e51d3a9d92c7a | 150 | 0.3% |
| 27a3dd3765f237db5971b5d1a599ade0f2aafd20691bd0bb89b5d28e0009ab5a | 100 | 0.2% |
| a80c2a2705371649dfcc32469b8fd69004a28d14847c7cbc2e294a38bdb5499d | 100 | 0.2% |
| 312e6afc01e161f899ef512391d4286d0dbfdd792b237cc9323de7bbf28a08c9 | 100 | 0.2% |
| ea2c57449ea90302b27b94e28702e8e6710196ab461c630e92c3c715930bde82 | 100 | 0.2% |
| 41422355eb3048b27e36b590dcd690ba74b5e776099e9abdcdb56c6a665877c4 | 100 | 0.2% |
| a0127c5d5f8788e78fbf73149e2b529bf2e7600010f51dd7b502e4228b4f6765 | 100 | 0.2% |
| 15ef164c73d7770d6b96adf1a1cd8274dd3b53f26bb7fb10aee4f3a8b02fc264 | 100 | 0.2% |
| 7421988e586066bc9539e680a9622d4d084bbd71182eb32f24af29c2659cd374 | 100 | 0.2% |
| 152b4f077b594ad4fb17282643b4a30e91e41b98f19e4fde047877e5a9dda473 | 100 | 0.2% |
| Other values (862) | 43394 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 186358 | 6.6% |
| 9 | 183320 | 6.4% |
| f | 180164 | 6.3% |
| 3 | 180065 | 6.3% |
| 0 | 179588 | 6.3% |
| 8 | 178593 | 6.3% |
| 5 | 178294 | 6.3% |
| a | 177140 | 6.2% |
| 6 | 176776 | 6.2% |
| d | 176279 | 6.2% |
| Other values (6) | 1047839 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1785270 | |
| Lowercase Letter | 1059146 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 186358 | |
| 9 | 183320 | |
| 3 | 180065 | |
| 0 | 179588 | |
| 8 | 178593 | |
| 5 | 178294 | |
| 6 | 176776 | |
| 7 | 175823 | |
| 4 | 174795 | |
| 1 | 171658 |
Lowercase Letter
| Value | Count | Frequency (%) |
| f | 180164 | |
| a | 177140 | |
| d | 176279 | |
| e | 175707 | |
| c | 175081 | |
| b | 174775 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1785270 | |
| Latin | 1059146 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 186358 | |
| 9 | 183320 | |
| 3 | 180065 | |
| 0 | 179588 | |
| 8 | 178593 | |
| 5 | 178294 | |
| 6 | 176776 | |
| 7 | 175823 | |
| 4 | 174795 | |
| 1 | 171658 |
Latin
| Value | Count | Frequency (%) |
| f | 180164 | |
| a | 177140 | |
| d | 176279 | |
| e | 175707 | |
| c | 175081 | |
| b | 174775 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2844416 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 186358 | 6.6% |
| 9 | 183320 | 6.4% |
| f | 180164 | 6.3% |
| 3 | 180065 | 6.3% |
| 0 | 179588 | 6.3% |
| 8 | 178593 | 6.3% |
| 5 | 178294 | 6.3% |
| a | 177140 | 6.2% |
| 6 | 176776 | 6.2% |
| d | 176279 | 6.2% |
| Other values (6) | 1047839 |
imp_hash
Categorical
| Distinct | 1 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 347.3 KiB |
| fbcff5951ad0c204f4744c629548c6c6 |
|---|
Length
| Max length | 32 |
|---|---|
| Median length | 32 |
| Mean length | 32 |
| Min length | 32 |
Characters and Unicode
| Total characters | 1422208 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | fbcff5951ad0c204f4744c629548c6c6 |
|---|---|
| 2nd row | fbcff5951ad0c204f4744c629548c6c6 |
| 3rd row | fbcff5951ad0c204f4744c629548c6c6 |
| 4th row | fbcff5951ad0c204f4744c629548c6c6 |
| 5th row | fbcff5951ad0c204f4744c629548c6c6 |
Common Values
| Value | Count | Frequency (%) |
| fbcff5951ad0c204f4744c629548c6c6 | 44444 | 100.0% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| fbcff5951ad0c204f4744c629548c6c6 | 44444 | 100.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| c | 222220 | 15.6% |
| 4 | 222220 | 15.6% |
| f | 177776 | 12.5% |
| 5 | 133332 | 9.4% |
| 6 | 133332 | 9.4% |
| 9 | 88888 | 6.2% |
| 0 | 88888 | 6.2% |
| 2 | 88888 | 6.2% |
| b | 44444 | 3.1% |
| 1 | 44444 | 3.1% |
| Other values (4) | 177776 | 12.5% |
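Because this column holds a single constant string, the character counts above can be reproduced by counting characters in that one string and multiplying by the number of rows. A minimal sketch using only the standard library; the variable names are illustrative:

```python
from collections import Counter

imp_hash = "fbcff5951ad0c204f4744c629548c6c6"  # the constant value reported above
n_rows = 44444                                  # number of observations

per_value = Counter(imp_hash)
# Every row holds the same string, so column-wide counts are
# the per-string counts multiplied by the number of rows.
column_counts = {ch: k * n_rows for ch, k in per_value.items()}
total_chars = len(imp_hash) * n_rows

# Show the five most frequent characters with their share of all characters.
for ch, count in sorted(column_counts.items(), key=lambda kv: -kv[1])[:5]:
    print(ch, count, f"{100 * count / total_chars:.1f}%")
```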
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 888880 | 62.5% |
| Lowercase Letter | 533328 | 37.5% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 4 | 222220 | 25.0% |
| 5 | 133332 | 15.0% |
| 6 | 133332 | 15.0% |
| 9 | 88888 | 10.0% |
| 0 | 88888 | 10.0% |
| 2 | 88888 | 10.0% |
| 1 | 44444 | 5.0% |
| 7 | 44444 | 5.0% |
| 8 | 44444 | 5.0% |
Lowercase Letter
| Value | Count | Frequency (%) |
| c | 222220 | 41.7% |
| f | 177776 | 33.3% |
| b | 44444 | 8.3% |
| a | 44444 | 8.3% |
| d | 44444 | 8.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 888880 | 62.5% |
| Latin | 533328 | 37.5% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 4 | 222220 | 25.0% |
| 5 | 133332 | 15.0% |
| 6 | 133332 | 15.0% |
| 9 | 88888 | 10.0% |
| 0 | 88888 | 10.0% |
| 2 | 88888 | 10.0% |
| 1 | 44444 | 5.0% |
| 7 | 44444 | 5.0% |
| 8 | 44444 | 5.0% |
Latin
| Value | Count | Frequency (%) |
| c | 222220 | 41.7% |
| f | 177776 | 33.3% |
| b | 44444 | 8.3% |
| a | 44444 | 8.3% |
| d | 44444 | 8.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1422208 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| c | 222220 | 15.6% |
| 4 | 222220 | 15.6% |
| f | 177776 | 12.5% |
| 5 | 133332 | 9.4% |
| 6 | 133332 | 9.4% |
| 9 | 88888 | 6.2% |
| 0 | 88888 | 6.2% |
| 2 | 88888 | 6.2% |
| b | 44444 | 3.1% |
| 1 | 44444 | 3.1% |
| Other values (4) | 177776 | 12.5% |
sec_chi2
Real number (ℝ≥0)
| Distinct | 175 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2203016.473 |
| Minimum | 57874.69 |
|---|---|
| Maximum | 73113600 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 347.3 KiB |
Quantile statistics
| Minimum | 57874.69 |
|---|---|
| 5-th percentile | 358466.16 |
| Q1 | 1044480 |
| median | 1044480 |
| Q3 | 2088960 |
| 95-th percentile | 7311360 |
| Maximum | 73113600 |
| Range | 73055725.31 |
| Interquartile range (IQR) | 1044480 |
Descriptive statistics
| Standard deviation | 6622403.846 |
|---|---|
| Coefficient of variation (CV) | 3.006061882 |
| Kurtosis | 103.8191048 |
| Mean | 2203016.473 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 9.999100205 |
| Sum | 9.791086414 × 10¹⁰ |
| Variance | 4.385623269 × 10¹³ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 1044480 | 26991 | 60.7% |
| 2088960 | 7405 | 16.7% |
| 7311360 | 3106 | 7.0% |
| 358466.16 | 917 | 2.1% |
| 2262192.25 | 917 | 2.1% |
| 962731.13 | 917 | 2.1% |
| 66544.3 | 917 | 2.1% |
| 842616.38 | 917 | 2.1% |
| 57874.69 | 917 | 2.1% |
| 951366.5 | 565 | 1.3% |
| Other values (165) | 875 | 2.0% |
| Value | Count | Frequency (%) |
| 57874.69 | 917 | 2.1% |
| 66544.3 | 917 | 2.1% |
| 105580.31 | 1 | < 0.1% |
| 106287.13 | 1 | < 0.1% |
| 109876.63 | 1 | < 0.1% |
| 109919.63 | 1 | < 0.1% |
| 113882.25 | 1 | < 0.1% |
| 113937.75 | 1 | < 0.1% |
| 114061.88 | 1 | < 0.1% |
| 114343.63 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 73113600 | 361 | 0.8% |
| 7311360 | 3106 | 7.0% |
| 2262192.25 | 917 | 2.1% |
| 2248354 | 1 | < 0.1% |
| 2248330.5 | 1 | < 0.1% |
| 2248190.5 | 1 | < 0.1% |
| 2247969.5 | 1 | < 0.1% |
| 2246972.5 | 1 | < 0.1% |
| 2088960 | 7405 | 16.7% |
| 2063693.38 | 1 | < 0.1% |
sec_entropy
Real number (ℝ≥0)
| Distinct | 102 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.6538920889 |
| Minimum | 0 |
|---|---|
| Maximum | 7.84 |
| Zeros | 37863 |
| Zeros (%) | 85.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 347.3 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 6.12 |
| Maximum | 7.84 |
| Range | 7.84 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 1.890411814 |
|---|---|
| Coefficient of variation (CV) | 2.891014964 |
| Kurtosis | 6.921051665 |
| Mean | 0.6538920889 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.892108035 |
| Sum | 29061.58 |
| Variance | 3.573656827 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 37863 | 85.2% |
| 6.12 | 917 | 2.1% |
| 7.38 | 917 | 2.1% |
| 0.48 | 917 | 2.1% |
| 7.84 | 917 | 2.1% |
| 1.04 | 917 | 2.1% |
| 5.33 | 917 | 2.1% |
| 2.97 | 565 | 1.3% |
| 2.98 | 346 | 0.8% |
| 0.96 | 7 | < 0.1% |
| Other values (92) | 161 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 37863 | 85.2% |
| 0.23 | 1 | < 0.1% |
| 0.32 | 1 | < 0.1% |
| 0.33 | 2 | < 0.1% |
| 0.48 | 917 | 2.1% |
| 0.52 | 1 | < 0.1% |
| 0.53 | 1 | < 0.1% |
| 0.63 | 2 | < 0.1% |
| 0.64 | 1 | < 0.1% |
| 0.86 | 6 | < 0.1% |
| Value | Count | Frequency (%) |
| 7.84 | 917 | 2.1% |
| 7.38 | 917 | 2.1% |
| 6.12 | 917 | 2.1% |
| 5.69 | 5 | < 0.1% |
| 5.66 | 2 | < 0.1% |
| 5.43 | 1 | < 0.1% |
| 5.4 | 1 | < 0.1% |
| 5.33 | 917 | 2.1% |
| 5.29 | 1 | < 0.1% |
| 5.28 | 3 | < 0.1% |
sec_md5
Categorical
| Distinct | 176 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 347.3 KiB |
| 620f0b67a91f7f74151bc5be745b7110 | 26991 |
|---|---|
| 0829f71740aab1ab98b33eae21dee122 | 7405 |
| cf845a781c107ec1346e849c9dd1b7e8 | 3106 |
| a43aef6b6f939e7959b81ec2e8806ecb | 917 |
| 41432e60924ed4a91548fe43265fa3d1 | 917 |
| Other values (171) |
Length
| Max length | 32 |
|---|---|
| Median length | 32 |
| Mean length | 32 |
| Min length | 32 |
Characters and Unicode
| Total characters | 1422208 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 160 |
|---|---|
| Unique (%) | 0.4% |
Sample
| 1st row | a43aef6b6f939e7959b81ec2e8806ecb |
|---|---|
| 2nd row | 8d17b27fb2e22ce52dc7953821236946 |
| 3rd row | 41432e60924ed4a91548fe43265fa3d1 |
| 4th row | ee4f14b17ad059c6b83e4503a65b81f5 |
| 5th row | c694e1149d43f57e80047a0bc754f9d3 |
Common Values
| Value | Count | Frequency (%) |
| 620f0b67a91f7f74151bc5be745b7110 | 26991 | 60.7% |
| 0829f71740aab1ab98b33eae21dee122 | 7405 | 16.7% |
| cf845a781c107ec1346e849c9dd1b7e8 | 3106 | 7.0% |
| a43aef6b6f939e7959b81ec2e8806ecb | 917 | 2.1% |
| 41432e60924ed4a91548fe43265fa3d1 | 917 | 2.1% |
| ee4f14b17ad059c6b83e4503a65b81f5 | 917 | 2.1% |
| c694e1149d43f57e80047a0bc754f9d3 | 917 | 2.1% |
| a12e0f6f3ca72d1748b5dbba51b45e19 | 917 | 2.1% |
| cf4f5746abe0554542e6173fc6438b6c | 917 | 2.1% |
| 8d17b27fb2e22ce52dc7953821236946 | 565 | 1.3% |
| Other values (166) | 875 | 2.0% |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| 620f0b67a91f7f74151bc5be745b7110 | 26991 | 60.7% |
| 0829f71740aab1ab98b33eae21dee122 | 7405 | 16.7% |
| cf845a781c107ec1346e849c9dd1b7e8 | 3106 | 7.0% |
| a43aef6b6f939e7959b81ec2e8806ecb | 917 | 2.1% |
| 41432e60924ed4a91548fe43265fa3d1 | 917 | 2.1% |
| ee4f14b17ad059c6b83e4503a65b81f5 | 917 | 2.1% |
| c694e1149d43f57e80047a0bc754f9d3 | 917 | 2.1% |
| a12e0f6f3ca72d1748b5dbba51b45e19 | 917 | 2.1% |
| cf4f5746abe0554542e6173fc6438b6c | 917 | 2.1% |
| 8d17b27fb2e22ce52dc7953821236946 | 565 | 1.3% |
| Other values (166) | 875 | 2.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 191689 | |
| 7 | 171519 | |
| b | 147761 | |
| 0 | 109599 | |
| f | 105021 | 7.4% |
| 5 | 102310 | 7.2% |
| 4 | 94223 | 6.6% |
| e | 84046 | 5.9% |
| a | 72273 | 5.1% |
| 6 | 71850 | 5.1% |
| Other values (6) | 271917 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 940017 | |
| Lowercase Letter | 482191 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 191689 | |
| 7 | 171519 | |
| 0 | 109599 | |
| 5 | 102310 | |
| 4 | 94223 | |
| 6 | 71850 | 7.6% |
| 2 | 68366 | 7.3% |
| 9 | 61006 | 6.5% |
| 8 | 38008 | 4.0% |
| 3 | 31447 | 3.3% |
Lowercase Letter
| Value | Count | Frequency (%) |
| b | 147761 | |
| f | 105021 | |
| e | 84046 | |
| a | 72273 | |
| c | 50189 | 10.4% |
| d | 22901 | 4.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 940017 | |
| Latin | 482191 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 191689 | |
| 7 | 171519 | |
| 0 | 109599 | |
| 5 | 102310 | |
| 4 | 94223 | |
| 6 | 71850 | 7.6% |
| 2 | 68366 | 7.3% |
| 9 | 61006 | 6.5% |
| 8 | 38008 | 4.0% |
| 3 | 31447 | 3.3% |
Latin
| Value | Count | Frequency (%) |
| b | 147761 | |
| f | 105021 | |
| e | 84046 | |
| a | 72273 | |
| c | 50189 | 10.4% |
| d | 22901 | 4.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1422208 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 191689 | |
| 7 | 171519 | |
| b | 147761 | |
| 0 | 109599 | |
| f | 105021 | 7.4% |
| 5 | 102310 | 7.2% |
| 4 | 94223 | 6.6% |
| e | 84046 | 5.9% |
| a | 72273 | 5.1% |
| 6 | 71850 | 5.1% |
| Other values (6) | 271917 |
raw_size
Real number (ℝ≥0)
| Distinct | 8 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 24557.47565 |
| Minimum | 4096 |
|---|---|
| Maximum | 507904 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 347.3 KiB |
Quantile statistics
| Minimum | 4096 |
|---|---|
| 5-th percentile | 4096 |
| Q1 | 4096 |
| median | 4096 |
| Q3 | 8192 |
| 95-th percentile | 32768 |
| Maximum | 507904 |
| Range | 503808 |
| Interquartile range (IQR) | 4096 |
Descriptive statistics
| Standard deviation | 80934.54156 |
|---|---|
| Coefficient of variation (CV) | 3.2957191 |
| Kurtosis | 24.94119374 |
| Mean | 24557.47565 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.978015168 |
| Sum | 1091432448 |
| Variance | 6550400017 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=8)
| Value | Count | Frequency (%) |
| 4096 | 28940 | 65.1% |
| 8192 | 9275 | 20.9% |
| 28672 | 3110 | 7.0% |
| 32768 | 917 | 2.1% |
| 507904 | 917 | 2.1% |
| 225280 | 917 | 2.1% |
| 286720 | 366 | 0.8% |
| 212992 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 4096 | 28940 | 65.1% |
| 8192 | 9275 | 20.9% |
| 28672 | 3110 | 7.0% |
| 32768 | 917 | 2.1% |
| 212992 | 2 | < 0.1% |
| 225280 | 917 | 2.1% |
| 286720 | 366 | 0.8% |
| 507904 | 917 | 2.1% |
| Value | Count | Frequency (%) |
| 507904 | 917 | 2.1% |
| 286720 | 366 | 0.8% |
| 225280 | 917 | 2.1% |
| 212992 | 2 | < 0.1% |
| 32768 | 917 | 2.1% |
| 28672 | 3110 | 7.0% |
| 8192 | 9275 | 20.9% |
| 4096 | 28940 | 65.1% |
virtual_size
Real number (ℝ≥0)
| Distinct | 74 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22068.3459 |
| Minimum | 132 |
|---|---|
| Maximum | 504779 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 347.3 KiB |
Quantile statistics
| Minimum | 132 |
|---|---|
| 5-th percentile | 336 |
| Q1 | 877 |
| median | 2302 |
| Q3 | 4751 |
| 95-th percentile | 32686 |
| Maximum | 504779 |
| Range | 504647 |
| Interquartile range (IQR) | 3874 |
Descriptive statistics
| Standard deviation | 80911.62619 |
|---|---|
| Coefficient of variation (CV) | 3.666411002 |
| Kurtosis | 24.82677084 |
| Mean | 22068.3459 |
| Median Absolute Deviation (MAD) | 1731 |
| Skewness | 4.963146834 |
| Sum | 980805565 |
| Variance | 6546691253 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 571 | 4207 | 9.5% |
| 27856 | 3110 | 7.0% |
| 1846 | 3067 | 6.9% |
| 2302 | 2537 | 5.7% |
| 4751 | 2151 | 4.8% |
| 4388 | 2040 | 4.6% |
| 3038 | 1943 | 4.4% |
| 431 | 1800 | 4.1% |
| 954 | 1684 | 3.8% |
| 4728 | 1565 | 3.5% |
| Other values (64) | 20340 |
| Value | Count | Frequency (%) |
| 132 | 421 | 0.9% |
| 180 | 231 | 0.5% |
| 249 | 64 | 0.1% |
| 262 | 17 | < 0.1% |
| 268 | 48 | 0.1% |
| 311 | 826 | |
| 318 | 597 | 1.3% |
| 336 | 917 | |
| 364 | 5 | < 0.1% |
| 431 | 1800 |
| Value | Count | Frequency (%) |
| 504779 | 917 | 2.1% |
| 282996 | 366 | 0.8% |
| 223348 | 917 | 2.1% |
| 210737 | 2 | < 0.1% |
| 32686 | 917 | 2.1% |
| 27856 | 3110 | |
| 8150 | 1 | < 0.1% |
| 8130 | 96 | 0.2% |
| 8104 | 684 | 1.5% |
| 8097 | 18 | < 0.1% |
virtual_address
Real number (ℝ≥0)
| Distinct | 166 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 917060.5216 |
| Minimum | 4096 |
|---|---|
| Maximum | 1847296 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 347.3 KiB |
Quantile statistics
| Minimum | 4096 |
|---|---|
| 5-th percentile | 45056 |
| Q1 | 819200 |
| median | 962560 |
| Q3 | 1073152 |
| 95-th percentile | 1277952 |
| Maximum | 1847296 |
| Range | 1843200 |
| Interquartile range (IQR) | 253952 |
Descriptive statistics
| Standard deviation | 283631.6676 |
|---|---|
| Coefficient of variation (CV) | 0.3092834779 |
| Kurtosis | 3.461868248 |
| Mean | 917060.5216 |
| Median Absolute Deviation (MAD) | 126976 |
| Skewness | -1.691877275 |
| Sum | 4.075783782 × 10¹⁰ |
| Variance | 8.044692286 × 10¹⁰ |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 4096 | 917 | 2.1% |
| 782336 | 917 | 2.1% |
| 798720 | 917 | 2.1% |
| 794624 | 917 | 2.1% |
| 786432 | 917 | 2.1% |
| 36864 | 917 | 2.1% |
| 557056 | 917 | 2.1% |
| 552960 | 917 | 2.1% |
| 45056 | 917 | 2.1% |
| 802816 | 898 | 2.0% |
| Other values (156) | 35293 |
| Value | Count | Frequency (%) |
| 4096 | 917 | 2.1% |
| 36864 | 917 | 2.1% |
| 45056 | 917 | 2.1% |
| 552960 | 917 | 2.1% |
| 557056 | 917 | 2.1% |
| 782336 | 917 | 2.1% |
| 786432 | 917 | 2.1% |
| 794624 | 917 | 2.1% |
| 798720 | 917 | 2.1% |
| 802816 | 898 | 2.0% |
| Value | Count | Frequency (%) |
| 1847296 | 1 | < 0.1% |
| 1630208 | 1 | < 0.1% |
| 1613824 | 1 | < 0.1% |
| 1609728 | 1 | < 0.1% |
| 1605632 | 1 | < 0.1% |
| 1601536 | 3 | < 0.1% |
| 1597440 | 2 | < 0.1% |
| 1589248 | 1 | < 0.1% |
| 1568768 | 1 | < 0.1% |
| 1560576 | 2 | < 0.1% |
| Distinct | 1495 |
|---|---|
| Distinct (%) | 3.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 347.3 KiB |
| .text | 917 |
|---|---|
| .data | 917 |
| .pdata | 917 |
| .EXP | 917 |
| .rsrc | 917 |
| Other values (1490) | 39859 |
Length
| Max length | 7 |
|---|---|
| Median length | 6 |
| Mean length | 5.500607506 |
| Min length | 4 |
Characters and Unicode
| Total characters | 244469 |
|---|---|
| Distinct characters | 30 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
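As a rough illustration of how such category counts can be obtained, the sketch below uses Python's standard `unicodedata` module on the five sample section names shown in this report. It is an assumption that the profiling tool works along these lines; the code only demonstrates the character property lookup itself.

```python
import unicodedata
from collections import Counter

# Sample section names taken from the "Sample" rows of this report.
names = [".text", ".rdata", ".data", ".pdata", ".EXP"]

# General category of every character: 'Ll' = Lowercase Letter,
# 'Lu' = Uppercase Letter, 'Po' = Other Punctuation.
category_counts = Counter(
    unicodedata.category(ch) for name in names for ch in name
)

# All characters here fall in the ASCII block (code points below 128).
all_ascii = all(ord(ch) < 128 for name in names for ch in name)

for category, count in sorted(category_counts.items()):
    print(category, count)
print("ASCII only:", all_ascii)
```

On this small sample the same three categories appear as in the full column: lowercase letters, uppercase letters (from `.EXP`) and the leading dot as other punctuation.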
Unique
| Unique | 559 |
|---|---|
| Unique (%) | 1.3% |
Sample
| 1st row | .text |
|---|---|
| 2nd row | .rdata |
| 3rd row | .data |
| 4th row | .pdata |
| 5th row | .EXP |
Common Values
| Value | Count | Frequency (%) |
| .text | 917 | 2.1% |
| .data | 917 | 2.1% |
| .pdata | 917 | 2.1% |
| .EXP | 917 | 2.1% |
| .rsrc | 917 | 2.1% |
| .reloc | 917 | 2.1% |
| .tbrtao | 917 | 2.1% |
| .rdata | 917 | 2.1% |
| .jubcj | 548 | 1.2% |
| .lrvm | 499 | 1.1% |
| Other values (1485) | 36061 | 81.1% |
Length
Histogram of lengths of the category
| Value | Count | Frequency (%) |
| text | 917 | 2.1% |
| reloc | 917 | 2.1% |
| rdata | 917 | 2.1% |
| tbrtao | 917 | 2.1% |
| data | 917 | 2.1% |
| rsrc | 917 | 2.1% |
| pdata | 917 | 2.1% |
| exp | 917 | 2.1% |
| jubcj | 548 | 1.2% |
| lrvm | 499 | 1.1% |
| Other values (1485) | 36061 | 81.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 44444 | 18.2% |
| r | 14056 | 5.7% |
| t | 12595 | 5.2% |
| a | 12308 | 5.0% |
| d | 9597 | 3.9% |
| c | 9291 | 3.8% |
| e | 9245 | 3.8% |
| x | 8819 | 3.6% |
| k | 8584 | 3.5% |
| p | 8353 | 3.4% |
| Other values (20) | 107177 | 43.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 197274 | 80.7% |
| Other Punctuation | 44444 | 18.2% |
| Uppercase Letter | 2751 | 1.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 14056 | 7.1% |
| t | 12595 | 6.4% |
| a | 12308 | 6.2% |
| d | 9597 | 4.9% |
| c | 9291 | 4.7% |
| e | 9245 | 4.7% |
| x | 8819 | 4.5% |
| k | 8584 | 4.4% |
| p | 8353 | 4.2% |
| f | 7814 | 4.0% |
| Other values (16) | 96612 | 49.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| P | 917 | 33.3% |
| X | 917 | 33.3% |
| E | 917 | 33.3% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 44444 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 200025 | 81.8% |
| Common | 44444 | 18.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 14056 | 7.0% |
| t | 12595 | 6.3% |
| a | 12308 | 6.2% |
| d | 9597 | 4.8% |
| c | 9291 | 4.6% |
| e | 9245 | 4.6% |
| x | 8819 | 4.4% |
| k | 8584 | 4.3% |
| p | 8353 | 4.2% |
| f | 7814 | 3.9% |
| Other values (19) | 99363 | 49.7% |
Common
| Value | Count | Frequency (%) |
| . | 44444 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 244469 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 44444 | 18.2% |
| r | 14056 | 5.7% |
| t | 12595 | 5.2% |
| a | 12308 | 5.0% |
| d | 9597 | 3.9% |
| c | 9291 | 3.8% |
| e | 9245 | 3.8% |
| x | 8819 | 3.6% |
| k | 8584 | 3.5% |
| p | 8353 | 3.4% |
| Other values (20) | 107177 | 43.8% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
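The rank-based definition can be sketched in a few lines of standard-library Python. The helper names (`average_ranks`, `pearson`, `spearman`) and the toy data are illustrative assumptions, not part of the profiling tool; ties receive the average of their positions, which is one common convention.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]   # monotonic but nonlinear in x

print("Pearson r :", pearson(x, y))
print("Spearman ρ:", spearman(x, y))
```

Because `y` increases whenever `x` does, the ranks coincide and ρ reaches its maximum, while r stays below 1 for the same data.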
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
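A small sketch of that calculation, together with a check of the invariance property, is given here. The function name `pearson_r` and the sample data are assumptions made for illustration; the scale factors used are positive, since a negative factor would flip the sign of r.

```python
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors of the covariance and the standard deviations cancel,
    so plain sums of deviations are used.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


x = [1.0, 2.0, 4.0, 7.0, 11.0]
y = [2.0, 1.5, 5.0, 6.5, 12.0]

# Separate changes of location and (positive) scale for each variable.
x_shifted = [2.0 * a + 5.0 for a in x]
y_shifted = [0.5 * b - 3.0 for b in y]

r_original = pearson_r(x, y)
r_shifted = pearson_r(x_shifted, y_shifted)

# Two exact lines with different slopes both give r = 1.
r_shallow = pearson_r(x, [0.1 * a for a in x])
r_steep = pearson_r(x, [10.0 * a + 1.0 for a in x])

print(r_original, r_shifted, r_shallow, r_steep)
```

The last two values illustrate the remark about the angle to the x-axis: the slope of an exact linear relationship does not change r.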
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
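The pair-counting definition can be written out directly. The sketch below implements the simple form described above (often called τ-a, with no correction for ties); the function name and data are illustrative, and a profiling tool may well use a tie-corrected variant instead.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, without tie correction."""
    pairs = list(combinations(range(len(x)), 2))
    concordant = discordant = 0
    for i, j in pairs:
        # The pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]   # one swapped pair relative to x

tau = kendall_tau_a(x, y)
print("Kendall τ:", tau)
```

With four observations there are six pairs; five are concordant and one is discordant, so τ = (5 - 1) / 6.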
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion visually.
First rows
| | df_index | Unnamed: 0 | filename | win_count | sha256 | imp_hash | sec_chi2 | sec_entropy | sec_md5 | raw_size | virtual_size | virtual_address | sec_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5245 | 5245 | 20220329/2022032900/2022032900_10 | 991 | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 | fbcff5951ad0c204f4744c629548c6c6 | 358466.16 | 6.12 | a43aef6b6f939e7959b81ec2e8806ecb | 32768 | 32686 | 4096 | .text |
| 1 | 5246 | 5246 | 20220329/2022032900/2022032900_10 | 991 | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 | fbcff5951ad0c204f4744c629548c6c6 | 951366.50 | 2.97 | 8d17b27fb2e22ce52dc7953821236946 | 8192 | 7802 | 36864 | .rdata |
| 2 | 5247 | 5247 | 20220329/2022032900/2022032900_10 | 991 | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 | fbcff5951ad0c204f4744c629548c6c6 | 2262192.25 | 7.38 | 41432e60924ed4a91548fe43265fa3d1 | 507904 | 504779 | 45056 | .data |
| 3 | 5248 | 5248 | 20220329/2022032900/2022032900_10 | 991 | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 | fbcff5951ad0c204f4744c629548c6c6 | 962731.13 | 0.48 | ee4f14b17ad059c6b83e4503a65b81f5 | 4096 | 336 | 552960 | .pdata |
| 4 | 5249 | 5249 | 20220329/2022032900/2022032900_10 | 991 | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 | fbcff5951ad0c204f4744c629548c6c6 | 66544.30 | 7.84 | c694e1149d43f57e80047a0bc754f9d3 | 225280 | 223348 | 557056 | .EXP |
| 5 | 5250 | 5250 | 20220329/2022032900/2022032900_10 | 991 | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 | fbcff5951ad0c204f4744c629548c6c6 | 842616.38 | 1.04 | a12e0f6f3ca72d1748b5dbba51b45e19 | 4096 | 976 | 782336 | .rsrc |
| 6 | 5251 | 5251 | 20220329/2022032900/2022032900_10 | 991 | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 | fbcff5951ad0c204f4744c629548c6c6 | 57874.69 | 5.33 | cf4f5746abe0554542e6173fc6438b6c | 8192 | 8061 | 786432 | .reloc |
| 7 | 5252 | 5252 | 20220329/2022032900/2022032900_10 | 991 | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 | fbcff5951ad0c204f4744c629548c6c6 | 1044480.00 | 0.00 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 2509 | 794624 | .tbrtao |
| 8 | 5253 | 5253 | 20220329/2022032900/2022032900_10 | 991 | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 | fbcff5951ad0c204f4744c629548c6c6 | 1044480.00 | 0.00 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 571 | 798720 | .jubcj |
| 9 | 5254 | 5254 | 20220329/2022032900/2022032900_10 | 991 | 3edfa8cca59d7e464270ef3b24bbefbc811cf305f4e80dbff061ec4ad7c18ea9 | fbcff5951ad0c204f4744c629548c6c6 | 1044480.00 | 0.00 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 2302 | 802816 | .ipo |
Last rows
| | df_index | Unnamed: 0 | filename | win_count | sha256 | imp_hash | sec_chi2 | sec_entropy | sec_md5 | raw_size | virtual_size | virtual_address | sec_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 44434 | 5555039 | 5555039 | 2022042101/2022042101_20 | 591567 | 0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b | fbcff5951ad0c204f4744c629548c6c6 | 1044480.0 | 0.0 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 311 | 1077248 | .gzlcc |
| 44435 | 5555040 | 5555040 | 2022042101/2022042101_20 | 591567 | 0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b | fbcff5951ad0c204f4744c629548c6c6 | 1044480.0 | 0.0 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 431 | 1081344 | .ubgnm |
| 44436 | 5555041 | 5555041 | 2022042101/2022042101_20 | 591567 | 0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b | fbcff5951ad0c204f4744c629548c6c6 | 1044480.0 | 0.0 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 431 | 1085440 | .pdvbd |
| 44437 | 5555042 | 5555042 | 2022042101/2022042101_20 | 591567 | 0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b | fbcff5951ad0c204f4744c629548c6c6 | 1044480.0 | 0.0 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 2302 | 1089536 | .nkh |
| 44438 | 5555043 | 5555043 | 2022042101/2022042101_20 | 591567 | 0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b | fbcff5951ad0c204f4744c629548c6c6 | 1044480.0 | 0.0 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 905 | 1093632 | .cftn |
| 44439 | 5555044 | 5555044 | 2022042101/2022042101_20 | 591567 | 0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b | fbcff5951ad0c204f4744c629548c6c6 | 1044480.0 | 0.0 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 877 | 1097728 | .fmkp |
| 44440 | 5555045 | 5555045 | 2022042101/2022042101_20 | 591567 | 0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b | fbcff5951ad0c204f4744c629548c6c6 | 2088960.0 | 0.0 | 0829f71740aab1ab98b33eae21dee122 | 8192 | 4388 | 1101824 | .zqdwjh |
| 44441 | 5555046 | 5555046 | 2022042101/2022042101_20 | 591567 | 0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b | fbcff5951ad0c204f4744c629548c6c6 | 2088960.0 | 0.0 | 0829f71740aab1ab98b33eae21dee122 | 8192 | 7782 | 1110016 | .chon |
| 44442 | 5555047 | 5555047 | 2022042101/2022042101_20 | 591567 | 0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b | fbcff5951ad0c204f4744c629548c6c6 | 2088960.0 | 0.0 | 0829f71740aab1ab98b33eae21dee122 | 8192 | 4751 | 1118208 | .nin |
| 44443 | 5555048 | 5555048 | 2022042101/2022042101_20 | 591567 | 0ecb88ad96e344604989c800d404f6253f26875fbe1c557e87cd58637f1a388b | fbcff5951ad0c204f4744c629548c6c6 | 1044480.0 | 0.0 | 620f0b67a91f7f74151bc5be745b7110 | 4096 | 730 | 1126400 | .rulqdd |